
Experimental GTFN Executor Caching #1197

Merged
merged 8 commits into GridTools:main on Mar 17, 2023

Conversation


@tehrengruber (Contributor) commented Mar 6, 2023

A small caching mechanism that makes repeated execution of stencils with the GTFN backend much faster. We should integrate this into the OTF pipeline in a better way, but this makes life much easier for now. I have left the caching strategy as SESSION for now, but for people who know what they are doing, running with cache.Strategy.PERSISTENT can make executing all tests a breeze (time-wise :-)).

The caching of the GTFN executor can now be configured:

```python
use_caching: bool = False
caching_strategy = cache.Strategy.SESSION
```
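A hypothetical usage sketch (the import paths and constructor shape are assumptions inferred from this thread, not the verified gt4py API):

```python
# Hypothetical sketch: enabling the experimental cache on a GTFN executor.
# Import paths and constructor arguments are assumed from this PR thread,
# not verified against the final gt4py API.
from gt4py.next.otf.compilation import cache  # assumed location of cache.Strategy
from gt4py.next.program_processors.runners import gtfn_cpu

executor = gtfn_cpu.GTFNExecutor(
    use_caching=True,                            # defaults to False
    caching_strategy=cache.Strategy.PERSISTENT,  # SESSION is the default
)
```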

This feature is experimental!

Extracted from #965

@DropD (Contributor) left a comment:

Since there are no tests that would allow us to say with any confidence that this caching is free of false positives (stale cache hits), there should at least be a way to switch it off. Off should probably then be the default on CI, if not the default overall.

Even if it looks obvious now that it does the right thing, nothing is in place to ensure we would notice if it breaks down the line.

@tehrengruber (Contributor, Author) replied:

I fully agree. Do you have a proposal for how to disable the caching without too much effort? We could make the caching an option in the GTFNTranslator, similar to use_imperative_backend, and make it default to False. If we also make the cache.Strategy a configurable option, this would fulfill all the requirements with respect to caching that I currently have. In the long term we should streamline the mechanism for passing backend options from the frontend, but for the time being this would be sufficient, I believe.

@DropD (Contributor) commented Mar 7, 2023

> I fully agree. Do you have a proposal for how to disable the caching without too much effort? We could make the caching an option in the GTFNTranslator, similar to use_imperative_backend, and make it default to False. If we also make the cache.Strategy a configurable option, this would fulfill all the requirements with respect to caching that I currently have. In the long term we should streamline the mechanism for passing backend options from the frontend, but for the time being this would be sufficient, I believe.

Let's go with the options in GTFNExecutor for now. However, we seem to be accumulating them already, so we shouldn't wait too long for a workflow-based solution.

@tehrengruber changed the title from "Poor man's cache for gtfn backend" to "Experimental GTFN Executor Caching" on Mar 8, 2023
@tehrengruber requested a review from DropD on March 8, 2023 at 15:21


```python
@dataclasses.dataclass(frozen=True)
class CachedWorkflow(Generic[StartT, EndT]):
```
A contributor commented:

Why is this not a Step subclass? It lives on the same level, has the same interoperability, etc.

Either way, the current naming uses Workflow for the concept and XyzStep for all the concrete implementations (any workflow can be a step in a super-workflow anyway). So naming this concrete implementation CachedWorkflow makes it potentially confusing.
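As a rough illustration of that convention (the names here are illustrative, not gt4py's actual definitions):

```python
# Illustrative only: "Workflow" names the concept, "...Step" names
# concrete implementations, per the convention described above.
from typing import Protocol, TypeVar

StartT = TypeVar("StartT")
EndT = TypeVar("EndT")

class Workflow(Protocol[StartT, EndT]):
    """The concept: any callable transforming a StartT into an EndT."""

    def __call__(self, inp: StartT) -> EndT:
        ...

class CachedStep:
    """Per the convention, a concrete implementation would be a '...Step'."""
```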

src/gt4py/next/otf/workflow.py (thread outdated, resolved)
Comment on lines 83 to 84:

```python
# TODO(tehrengruber): as the frontend types contain lists they are
# not hashable. As a workaround we just use content_hash here.
```
A contributor commented:

If this comment only explains why content_hash is used here, it does not need to be a TODO. Otherwise it should state something actionable, like "make frontend types hashable" or at least "verify this is the best solution".
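For context, a self-contained stand-in showing the idea behind the workaround (gt4py's real content_hash utility differs; this only illustrates keying a cache on unhashable inputs):

```python
import hashlib

def content_hash(obj: object) -> str:
    # Stand-in for the content_hash mentioned in the TODO: derive a stable
    # key from the object's textual content, since frontend types that
    # contain lists are not hashable themselves.
    return hashlib.sha256(repr(obj).encode()).hexdigest()

# A list is unhashable, but its content hash works as a dict key:
cache_key = content_hash([1, 2, 3])
```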

src/gt4py/next/program_processors/runners/gtfn_cpu.py (thread outdated, resolved)
@tehrengruber requested a review from DropD on March 11, 2023 at 16:48
@DropD (Contributor) commented Mar 14, 2023

Looks like the doctests have gone out of sync, but once they are fixed, it's ready to merge.

Comment on lines 167 to 169:

```python
if hash_ in self._cache:
    return self._cache[hash_]
return self._cache.setdefault(hash_, self.workflow(inp))
```
@tehrengruber (Contributor, Author) commented Mar 17, 2023:

Suggested change:

```diff
-if hash_ in self._cache:
-    return self._cache[hash_]
-return self._cache.setdefault(hash_, self.workflow(inp))
+try:
+    result = self._cache[hash_]
+except KeyError:
+    result = self._cache[hash_] = self.workflow(inp)
+return result
```

suggested by @egparedes
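For context, a self-contained sketch of the memoization pattern with the accepted suggestion (the class name and structure are illustrative, not necessarily what was merged). The try/except form performs a single dictionary lookup on a cache hit and computes the wrapped workflow only on a miss, whereas the original paired a membership test with a second lookup:

```python
import dataclasses
from typing import Any, Callable, Generic, TypeVar

StartT = TypeVar("StartT")
EndT = TypeVar("EndT")

@dataclasses.dataclass(frozen=True)
class CachedStep(Generic[StartT, EndT]):
    # Wraps a workflow step and memoizes its results per input hash.
    workflow: Callable[[StartT], EndT]
    hash_function: Callable[[StartT], Any] = hash
    _cache: dict = dataclasses.field(default_factory=dict, init=False)

    def __call__(self, inp: StartT) -> EndT:
        hash_ = self.hash_function(inp)
        try:
            result = self._cache[hash_]  # single lookup on a cache hit
        except KeyError:
            # computed only on a miss, then stored and returned
            result = self._cache[hash_] = self.workflow(inp)
        return result

# Usage:
doubler = CachedStep(workflow=lambda x: x * 2)
assert doubler(21) == 42  # computed and cached
assert doubler(21) == 42  # served from the cache
```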

@tehrengruber merged commit b802910 into GridTools:main on Mar 17, 2023